Abstract


Accurate prediction of lung cancer is critically needed to mitigate the high mortality rate associated with
this disease, and this work applies artificial intelligence to CT scan images for that purpose. Deep learning,
particularly convolutional neural networks (CNNs), emerges as a powerful tool that achieves superior
prediction accuracy compared to traditional machine learning methods. Using a dataset of 3,000 chest scan
images covering adenocarcinoma, benign tissue, and squamous cell carcinoma, the effectiveness of multiple
machine learning algorithms is evaluated. The evaluation confirms CNN as the optimal choice for accurate
prediction, with the implementation of VGG-19 further enhancing the assessment of lung cancer severity and
precautionary measures. Performance is analyzed using accuracy, precision, recall, and loss metrics. The
application is designed in Python, and result analysis is performed.
1. Introduction

The goal of this work is to predict lung cancer using
VGG19 transfer learning. Recent developments in
deep learning, especially in transfer learning
methods, have shown potential to support early
lung cancer diagnosis and detection based on
medical image analysis. In this study, we explore the
application of VGG19, a pre-trained CNN, with
transfer learning for the prediction of lung cancer.

The objectives are: 

• To explore the effectiveness of VGG19, a
pre-trained convolutional neural network, in the
context of lung cancer prediction.

• To transfer the knowledge learned by VGG19
from the ImageNet dataset to the task of lung
cancer classification.

• To compare the predictive performance of
VGG19 transfer learning with traditional
methods and other deep learning architectures.

The dataset comprises 3,000 histopathological
images categorized into three classes. The images
were created from an initial set of 750 validated,
HIPAA-compliant images of lung tissue: 250 images
each of lung squamous cell carcinoma, lung
adenocarcinoma, and benign lung tissue. Using the
Augmentor package, this set was augmented to
produce the final dataset of 3,000 images, with
1,000 images in each class.

VGG-19 is a deep CNN with 19 layers, widely used
for image classification. It is popular because it
employs multiple 3 x 3 filters in each convolutional
layer. Trained on the ImageNet database, with more
than a million images across 1000 categories, a
pre-trained VGG-19 model can classify images into
objects such as keyboards and animals. It achieves a
reported accuracy of 95% with a loss of 17%.

Lung cancer ranks second in cancer-related deaths,
posing a significant threat to the population. The
American Cancer Society emphasizes that early
diagnosis is crucial for recovery. Medical imaging is
effective for detecting lung cancer, with CT scans
being particularly efficient, and automated detection
offers improved results over manual checks. CNNs
are recognized as effective tools for image-based
prediction. The types of lung tissue considered in
this paper are adenocarcinoma, benign, and squamous
cell carcinoma, which differ in their characteristics
and consequences.

1.1. Lung Adenocarcinoma 

Lung adenocarcinoma is the most common type of 
lung cancer, making up approximately 30% of all 
cases and 40% of non-small cell lung cancer cases. 
Adenocarcinoma also occurs in other organs such as
the colon, prostate, and breast. In the lungs, it
develops in the glands responsible for aiding
breathing and mucus production. Symptoms may
include weakness, weight loss, hoarseness, and
coughing. [5] Figure 1 shows the types of lung
cancer pathology images.

Figure 1 Types of Lung Cancer Pathology Images


1.2. Squamous Cell Carcinoma 

Squamous cell lung cancer develops internally within 
the lung, often in the larger bronchial tubes 
connecting the trachea to the lung, or in major airway 
branches. [8] Various medical imaging techniques 
such as MRI, X-rays, and CT scans are used to detect 
lung cancer, with CT scans being particularly 
efficient. Automated detection offers improved 
results over manual checks. Convolutional Neural 
Networks are recognized as one of the most effective 
methods for image-based prediction among existing 
approaches. [9] 

2. Methodology 

In the proposed model, lung cancer is predicted
using VGG19 transfer learning. The results of
VGG-19 on the given dataset are analyzed, and
graphs are plotted to validate the results.

2.1 Dataset Details 

The collection contains 3,000 histopathological
images divided into three classes. Every image is
768 by 768 pixels and is stored in JPEG format. The
dataset was created from 750 original samples from
HIPAA-compliant, validated sources, which were
then augmented to 3,000 images of lung tissue using
the Augmentor software. Each of the three classes
contains 1,000 images.
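
The augmentation step described above could be sketched roughly as follows. The paper does not state which Augmentor operations were used, so the rotation and flip settings, the class folder names, and the helper names `augment_class` and `total_images` are assumptions for illustration only. `augment_class` needs the Augmentor package; it is imported inside the function so the rest of the sketch loads without it.

```python
# Hypothetical class folders; the actual directory layout is not given in the paper.
CLASSES = ["adenocarcinoma", "benign", "squamous_cell_carcinoma"]
ORIGINALS_PER_CLASS = 250   # originals per class, as stated in the text
TARGET_PER_CLASS = 1000     # augmented images per class, as stated in the text


def augment_class(source_dir, target_count=TARGET_PER_CLASS):
    """Generate `target_count` augmented images from the originals in `source_dir`."""
    # Imported lazily so the sketch can be loaded without Augmentor installed.
    import Augmentor

    pipeline = Augmentor.Pipeline(source_dir)
    # Assumed operations: small random rotations and random flips.
    pipeline.rotate(probability=1.0, max_left_rotation=25, max_right_rotation=25)
    pipeline.flip_left_right(probability=0.5)
    pipeline.flip_top_bottom(probability=0.5)
    # By default Augmentor writes the samples to an "output" folder in source_dir.
    pipeline.sample(target_count)


def total_images(classes=CLASSES, per_class=TARGET_PER_CLASS):
    """Size of the final, balanced dataset."""
    return len(classes) * per_class


print(total_images())  # 3000
```

With 250 originals and 1,000 target images per class, each original contributes four augmented variants on average.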


2.2 Algorithm/Methodology 
2.2.1 VGG-19 Algorithm for Lung Image Classification

VGG-19 is a deep convolutional neural network
(CNN) composed of 19 layers, renowned for its
effectiveness in image classification tasks. It employs
multiple 3 x 3 filters in each convolutional layer,
contributing to its ability to extract intricate features
from images. [10] Trained on the ImageNet database,
which contains a vast array of images spanning 1000
categories, a pre-trained VGG-19 model can classify
images with high accuracy. With an accuracy of 95%
and a loss of 17%, VGG-19 outperforms many
existing methods while requiring fewer training
samples and training faster. The network combines
16 convolutional layers with 3 fully connected
layers. The VGG19 architecture is characterized by
its use of small convolutional filters, which enable it
to capture detailed features within images
effectively.
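
A minimal sketch of how such a transfer-learning model might be assembled with Keras is given below. It is not the authors' implementation: the size of the new dense layer, the dropout rate, the optimizer, the learning rate, and the function name `build_vgg19_classifier` are all assumptions, and three output classes are assumed to match the three categories of the dataset. TensorFlow is imported inside the function so the sketch loads without it; building the model with `weights="imagenet"` downloads the pre-trained weights.

```python
def build_vgg19_classifier(num_classes=3, input_shape=(224, 224, 3),
                           weights="imagenet"):
    """Build a VGG19-based classifier with a new softmax head (illustrative)."""
    # Imported lazily so the sketch can be loaded without TensorFlow installed.
    from tensorflow import keras

    # Convolutional base pre-trained on ImageNet, without the 1000-way head.
    base = keras.applications.VGG19(
        weights=weights, include_top=False, input_shape=input_shape
    )
    base.trainable = False  # keep the transferred features fixed

    # New classification head; layer sizes here are assumptions.
    model = keras.Sequential([
        base,
        keras.layers.Flatten(),
        keras.layers.Dense(256, activation="relu"),
        keras.layers.Dropout(0.5),
        keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-4),
        loss="categorical_crossentropy",
        metrics=["accuracy", keras.metrics.Precision(), keras.metrics.Recall()],
    )
    return model
```

Freezing the convolutional base means only the new head is trained, which is one common way to reuse ImageNet features when the target dataset is small.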
2.2 Architecture of VGG-19 

2.2.1 Input Image 
The VGGNet architecture is designed to accept input
images of size 224x224 pixels. In the standard
VGGNet setup, a 224x224 patch is cropped from each
image. This approach ensures a consistent input size
across all images processed by the model,
facilitating efficient training and classification.
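
As an illustration, the coordinates of a centered 224x224 patch can be computed as in the sketch below. The paper does not say whether the 768x768 images are cropped, resized, or both, so a plain center crop is an assumption, and `center_crop_box` is a hypothetical helper. The box uses the (left, upper, right, lower) convention.

```python
def center_crop_box(width, height, size=224):
    """Return the (left, upper, right, lower) box of a centered square crop."""
    if width < size or height < size:
        raise ValueError("image is smaller than the requested crop")
    left = (width - size) // 2
    upper = (height - size) // 2
    return left, upper, left + size, upper + size


# A 768x768 image from the dataset described in this paper.
print(center_crop_box(768, 768))  # (272, 272, 496, 496)
```

In practice an image is often resized before cropping so that the patch covers more of the tissue; that choice is left open here.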

2.2.2 Convolution Layer 
To extract horizontal and vertical features
precisely, the VGG architecture uses a compact 3x3
receptive field. Additionally, 1x1 convolution
filters serve as a linear transformation of the input,
contributing to the network's capacity to extract
diverse features. Each convolutional layer is
followed by a Rectified Linear Unit (ReLU), which
introduces non-linearity to the network and speeds
up training, an innovation carried over from
AlexNet. [12] The ReLU activation function outputs
zero if the input is negative and the input itself if it
is positive, which improves the network's capacity to
recognise intricate patterns. The convolution stride
is fixed at 1 pixel to preserve spatial resolution after
convolution and to guarantee detailed feature
extraction while the filter shifts over the input
matrix.
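
The effect of the 3x3 filter, the 1-pixel stride, and ReLU can be checked with a short sketch. The padding of 1 pixel is taken from the standard VGG configuration and is not stated in the text above; the function names are illustrative.

```python
def conv_output_size(n, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution applied to an n x n input."""
    return (n + 2 * padding - kernel) // stride + 1


def relu(x):
    """Rectified Linear Unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


# A 3x3 filter with stride 1 and padding 1 keeps the 224x224 resolution.
print(conv_output_size(224))                       # 224
# A 1x1 filter with stride 1 and no padding also keeps the resolution.
print(conv_output_size(224, kernel=1, padding=0))  # 224
print(relu(-1.5), relu(2.0))                       # 0.0 2.0
```

Because the convolutions preserve the spatial size, only the pooling layers reduce resolution in this design.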

2.2.3. Hidden Layers 
In the VGG network, Rectified Linear Unit (ReLU) 
activation functions are applied across all hidden 
layers. Unlike some other architectures, VGG 
typically does not utilize Local Response 
Normalization (LRN) due to its tendency to increase 
both memory usage and training duration. 
Furthermore, LRN does not typically contribute to 
enhanced accuracy in VGG networks. 

2.2.4 Fully-Connected Layers 
The VGGNet architecture includes three fully
connected layers. The first two layers have 4096
channels each, and the third layer has 1000
channels, one channel for each class in the ImageNet
dataset.

Figure 2 VGG-19 Architecture

The architecture of VGG-19 is composed of five
blocks containing 16 convolutional layers in total.
Every block is followed by a max-pooling layer that
halves the spatial size of its input, and the number
of filters doubles from one block to the next until it
reaches 512. A sixth block holds the last three dense
layers, which are 4096, 4096, and 1000 in size,
respectively. [14] VGG is usually trained to classify
data into 1000 categories, hence the dimension of the
final dense layer (fc8) is 1000. However, to
accommodate the three-class classification task in
this study, the dimension of fc8 is set to three.
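
The layer counts and feature-map sizes described above can be traced with a small sketch. The per-block layer and filter counts follow the standard VGG-19 configuration; the output size of three is an assumption matching the three categories of the dataset, and `trace_vgg19` is an illustrative helper.

```python
# (number of conv layers, number of filters) for the five VGG-19 blocks.
VGG19_BLOCKS = [(2, 64), (2, 128), (4, 256), (4, 512), (4, 512)]
DENSE_UNITS = [4096, 4096]  # fc6 and fc7; fc8 depends on the number of classes


def trace_vgg19(input_size=224, num_classes=3):
    """Return block output shapes, weight-layer count, flattened size, dense sizes."""
    size = input_size
    shapes = []
    for _, filters in VGG19_BLOCKS:
        # The 3x3 convolutions keep the spatial size; the 2x2 max-pool
        # at the end of each block halves it.
        size //= 2
        shapes.append((size, size, filters))
    conv_layers = sum(n_convs for n_convs, _ in VGG19_BLOCKS)
    dense_sizes = DENSE_UNITS + [num_classes]
    weight_layers = conv_layers + len(dense_sizes)
    flattened = size * size * VGG19_BLOCKS[-1][1]
    return shapes, weight_layers, flattened, dense_sizes


shapes, weight_layers, flattened, dense_sizes = trace_vgg19()
for index, shape in enumerate(shapes, start=1):
    print(f"block {index} output: {shape}")
print("weight layers:", weight_layers)    # 19
print("flattened features:", flattened)   # 25088
print("dense layer sizes:", dense_sizes)  # [4096, 4096, 3]
```

The count of 16 convolutional layers plus 3 dense layers gives the 19 weight layers from which the network takes its name.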

3. Results and Discussion 

The trained VGG19 model is used to assess
individual images and predict the type of lung
cancer they contain. To determine the type of lung
cancer, the user can choose any image from the
dataset for testing. Figure 3 shows the process of the
dataset.

Figure 3 Process of the Dataset
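
For a single test image, turning the network's raw outputs into a predicted class might look like the following sketch. The ordering of `CLASS_NAMES` and the example scores are hypothetical and do not come from the paper.

```python
import math

# Assumed class ordering; the real ordering depends on how labels were encoded.
CLASS_NAMES = ["Adenocarcinoma", "Benign", "Squamous cell carcinoma"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracted for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(scores, class_names=CLASS_NAMES):
    """Return the most probable class name and its probability."""
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best], probabilities[best]


# Hypothetical raw scores for one test image.
label, confidence = predict_label([2.1, 0.3, -1.0])
print(label, round(confidence, 3))
```

The class with the highest probability is reported as the prediction, as in the "Adenocarcinoma" result shown in the output screenshots.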


3.1 Output Screenshots 

The output screenshots are as follows. Figure 4
shows the homepage, Figure 5 the login page, Figure
6 the performance analysis, Figure 7 the accuracy
and loss graph, Figure 8 the precision and recall
graph, Figure 9 the physician registration page,
Figure 10 the physician login page, Figure 11 the
front-end page for lung cancer prediction, and
Figure 12 the prediction result obtained, which is
"Adenocarcinoma".


Figure 4 Homepage

Figure 6 Performance Analysis

Figure 7 Accuracy & Loss graph

Figure 9 Physician Registration

The proposed model for lung cancer prediction is
implemented and analyzed with VGG-19, using
Python and relevant libraries.
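
The accuracy, precision, recall, and loss used in the performance analysis can be computed as in the sketch below. The labels are toy values chosen for illustration and are not results from this study; the function names and the short class codes are assumptions.

```python
import math


def classification_metrics(y_true, y_pred, classes):
    """Return overall accuracy and per-class (precision, recall)."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    per_class = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        per_class[c] = (precision, recall)
    return accuracy, per_class


def cross_entropy(probabilities, true_index):
    """Categorical cross-entropy loss for one sample."""
    return -math.log(probabilities[true_index])


# Toy labels: "aca" = adenocarcinoma, "scc" = squamous cell carcinoma.
y_true = ["aca", "aca", "benign", "scc", "scc", "benign"]
y_pred = ["aca", "scc", "benign", "scc", "scc", "benign"]
accuracy, per_class = classification_metrics(y_true, y_pred, ["aca", "benign", "scc"])
print(f"accuracy: {accuracy:.3f}")
for name, (precision, recall) in per_class.items():
    print(f"{name}: precision={precision:.3f} recall={recall:.3f}")
```

Reporting precision and recall per class shows which tissue types are confused with one another, which overall accuracy alone does not reveal.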
Conclusion 